Abstract

The operations on data sets, such as intersection, union and complement, are fundamental calculations in mathematics, so designing fast algorithms for set operations is significant. In this paper, a quantum algorithm for intersection is presented. Its running time is $O\left(\sqrt{|A| \times |B| \times |C|}\right)$ for the set operation $C = A \cap B$, while classical computation needs $O(|A| \times |B|)$ steps in general, where $|\cdot|$ denotes the size of a set. The presented algorithm is a combination of Grover's algorithm, classical memory and classical iterative computation, and this combination decreases the complexity of designing the quantum algorithm. The method can also be used to design other set operations.
I. INTRODUCTION

The operations on data sets, such as intersection and union, are fundamental calculations in mathematics. Fast computation of set operations is very important because it is the basis of many sciences and techniques, such as databases, image processing and signal processing. E.g., database search is based on set operations, so their fast computation is essential for it.

The computation procedure of a set operation on an electronic computer is illustrated below.

Suppose there are two vector sets A and B,
$A = \{(1,1,1,1), (2,2,2,2), (1,2,3,4)\}$,
$B = \{(3,3,3,3), (4,4,4,4), (1,2,3,4)\}$,
and the intersection set $C = A \cap B = \{(1,2,3,4)\}$.

First, all vectors of set A (or B) are stored in electronic memory, and each vector is treated as a database record. To compute $A \cap B$, for every vector in set A the computer searches all elements of set B for a match. Because sorting multi-dimensional vectors does not speed up the search in general, the vectors of a set are left unsorted. Thus, full search becomes the necessary choice for calculating the intersection on an electronic computer, which is inefficient when the sets are huge.
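The full search described above can be sketched as follows. This is a minimal illustration rather than code from the paper; the function name `full_search_intersection` and the comparison counter are assumptions introduced here to make the $|A| \times |B|$ cost visible.

```python
def full_search_intersection(set_a, set_b):
    """Classical full search: compare every vector of A with every vector of B."""
    result = []
    comparisons = 0  # hypothetical counter, only used to show the |A| x |B| cost
    for a in set_a:
        for b in set_b:
            comparisons += 1
            if a == b and a not in result:
                result.append(a)
    return result, comparisons


# The example sets from the text.
A = [(1, 1, 1, 1), (2, 2, 2, 2), (1, 2, 3, 4)]
B = [(3, 3, 3, 3), (4, 4, 4, 4), (1, 2, 3, 4)]
C, comparisons = full_search_intersection(A, B)
print(C, comparisons)
```

For the two three-element sets of the example, every one of the $3 \times 3$ pairs is compared.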

In addition, the running speed of the I/O (Input/Output) equipment of a classical computer is the efficiency bottleneck for an arbitrary classical algorithm [1]. A classical computer loads data into registers one by one via I/O and then executes calculation instructions one by one [1]. If a set has a huge number of elements, loading the data wastes a great deal of time and becomes an efficiency bottleneck. E.g., a server for database search is more expensive than a personal computer, and one important reason is that advanced I/O is used. If the size of the set is huge, the set operation faces this bottleneck, and there is no way to overcome it within the classical computation principle.

Therefore, for sets of huge size, the electronic computer cannot meet the requirement of fast computation. We need a new computation principle and a new algorithm for set operations.

Fortunately, quantum computation has been studied over the last decade and many surprising computational properties have been revealed, so that it has become one of the hottest research topics. One milestone of quantum computation research is Shor's algorithm for factoring an integer in a polynomial number of steps, which is believed to be classically impossible [2]. Grover presented another exciting algorithm, for database search, in 1996. Grover's algorithm costs only $O(\sqrt{N})$ steps to find a marked element in an unsorted database of size N, while $O(N)$ steps are needed on a classical computer [3]. The possible speedup of quantum computation is essentially enabled by quantum parallelism. This parallelism is based on the superposition property of states, which is not possible on electronic computers [4]. The computation procedure of a quantum computer is as follows: data is loaded into a superposition of states, the superposition is transformed by a special unitary operation, the amplitude of the solution is increased, and finally the solution is measured with high probability. Simple quantum computers have already been constructed. For example, Shor's algorithm has been demonstrated on an NMR quantum computer [5] and on an optical quantum computer [6] to factorize the number 15.

As is well known, the elements of a set are arbitrary (or random) data, the size of a set may be very large, and all data are stored in electronic memory unordered and temporarily. To study how quantum computation can perform a set operation, at least three tasks must be considered. The first task is how to express the information of a set using a quantum state. The second task is how to load the information of a set into a quantum state from electronic memory. The matching function $f_c$ between two elements is a computation, such as judging whether two vectors are equal, and the third task is how to embed the computation $f_c$ into a quantum search algorithm.

For the first task (i.e., how to express the set information using a state), there are currently two methods. One method, proposed by Latorre, encodes the information of classical data in the amplitudes of a state. Latorre used his method to express image data, and detail information is lost [7]. Latorre's method is useful for image compression, but it is not suitable for expressing the information of a general set because distortion of information is not permitted in a set operation. The other quantum expression, proposed by Pang, regards all elements as a sequence of database records $\{record_0, record_1, \ldots, record_{N-1}\}$ and uses the entangled state

$\frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} |i\rangle_{register1} |record_i\rangle_{register2}$

to express the information of the set [8, 9, 12, 13, 14]. The two registers are entangled in Pang's method, and there is no distortion theoretically. In addition, operating on the set is equivalent to operating on the entangled state.
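To make the index-record expression concrete, the sketch below lists the nonzero amplitudes of such an entangled state for a small set. It is only a classical bookkeeping illustration under the assumption that each basis state can be labelled by the pair (index, record); the function names are hypothetical.

```python
import math


def entangled_set_state(records):
    """Map each basis state |i>|record_i> to its amplitude 1/sqrt(N)."""
    n = len(records)
    amplitude = 1.0 / math.sqrt(n)
    return {(i, record): amplitude for i, record in enumerate(records)}


def measurement_probabilities(state):
    """Probability of observing each basis state (squared amplitude)."""
    return {basis: abs(amp) ** 2 for basis, amp in state.items()}


records = [(1, 1, 1, 1), (2, 2, 2, 2), (1, 2, 3, 4), (5, 5, 5, 5)]
state = entangled_set_state(records)
probs = measurement_probabilities(state)
print(probs)
```

Every record appears exactly once, paired with its own index, which is the sense in which this expression loses no information about the set.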

For the second task (i.e., how to load the set information into a state from electronic memory), the concept of the Quantum Loading Scheme (QLS) should be introduced. QLS is the unitary operation that loads all information of a data set into a quantum state from electronic memory. Nielsen and Chuang point out briefly that a future quantum computer should have a QLS [4, Section 6.5]. Vittorio Giovannetti, Seth Lloyd et al. present a simple QLS instance with few qubits [10]. Pang presents a QLS instance using the path interference of a molecule [11]. Pang's study shows that, for a vector $\vec{a} = \{a_0, a_1, \ldots, a_{N-1}\}$, there is a unitary operation $U_{QLS}$ [11] such that

$U_{QLS} |0\rangle_{register1} |0\rangle_{register2} |ancilla\rangle_{register3} = \left( \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} |i\rangle_{register1} |a_i\rangle_{register2} \right) |ancilla\rangle_{register3}$

where the classical data $a_0, a_1, \ldots, a_{N-1}$ are used as control signals to flip the particles. Fig. 1 illustrates QLS.

The studies of Seth Lloyd's group and of Pang also show that QLS is fast, with running time $O(\log_2 N)$, while the classical loading scheme via I/O has running time $O(N)$, which is the efficiency bottleneck for an arbitrary classical algorithm. Pang also presents a variant QLS, the unitary operation $U_L$, as below:

$U_L : \frac{1}{\sqrt{N}} \left( \sum_{i=0}^{N-1} |i\rangle_{register1} |0\rangle_{register2} \right) |ancilla\rangle \rightarrow \frac{1}{\sqrt{N}} \left( \sum_{i=0}^{N-1} |i\rangle_{register1} |a_i\rangle_{register2} \right) |ancilla\rangle$

The function of operation $U_L$ is to load data from electronic memory into two entangled registers according to the indices of the data.

In sum, Nielsen and Chuang point out that a QLS has to exist, and the studies of Seth Lloyd's group and of Pang both demonstrate the existence of QLS.

For the third task (i.e., how to embed other computation into a quantum search algorithm), the concept of the general Grover iteration should be introduced. In general, additional computation always accompanies a database search. E.g., suppose there is a database that stores the scores of students in many subjects. If we want to find the student with the maximum average score, the additional computation $f_c$ that calculates the average score is also needed. The famous Grover's algorithm can find a database record according to a given index, and it is the basis of many quantum search algorithms. However, Grover's algorithm is invalid for this kind of search, so it must be improved. Pang presents a general Grover iteration for the search case with additional computation, which is derived from the study of quantum image compression [8, 9].

FIG. 1: The Illustration of Quantum Loading Scheme (QLS): The function of QLS is to load all information of the vector $\vec{a} = \{a_0, a_1, \ldots, a_{N-1}\}$ into the superposition of states of the quantum CPU from electronic memory efficiently. In QLS, the classical data $a_0, a_1, \ldots, a_{N-1}$ are used as control signals to flip the particles. QLS has time complexity $O(\log_2 N)$, and it breaks the I/O efficiency bottleneck of the classical computer.

The Grover iteration is defined as [3, 4]

$G = (2|\xi\rangle\langle\xi| - I) O_f$

where $O_f$ is the oracle that flips the phase of the state in the Grover iteration, and $|\xi\rangle = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} |i\rangle$.

The General Grover Iteration (GGI) $G_{general}$ [8, 14] is defined as

$G_{general} = (2|\xi\rangle\langle\xi| - I) (U_L)^{\dagger} (O_c)^{\dagger} O_f O_c U_L$

where $O_c$ denotes the computation oracle for the additional function $f_c$.

FIG. 2: The Illustration of General Grover Iteration

Similar to the Grover iteration, $G_{general}$ acts on the initial state $|\xi\rangle = \frac{1}{\sqrt{N}} \left( \sum_{i=0}^{N-1} |i\rangle_{register1} |0\rangle_{register2} \right)$ for $O(\sqrt{N})$ times, and the solution will be found if the solution is unique [8, 9].
Grover's algorithm is very useful, and many improved algorithms and many of its properties have been studied (e.g., [19, 20]). Boyer, Brassard, Hoyer, and Tapp present an improved algorithm, named the BBHT algorithm, in their paper.

Suppose there is a sequence of data $T[i]$ ($0 \le i < N$). The steps of the BBHT algorithm are:

Step1. Initialize $\Gamma = 1$ and $\lambda = 6/5$ (any value of $\lambda$ strictly between 1 and 4/3 would do).

Step2. Choose j uniformly at random among the nonnegative integers not bigger than $\Gamma$.

Step3. Apply j iterations of Grover's algorithm starting from the state $|\Psi_0\rangle = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} |i\rangle$.

Step4. Observe the register: let $i_0$ be the outcome.

Step5. If $T[i_0] = x$, the problem is solved; exit.

Step6. Otherwise, set $\Gamma$ to $\min\{\lambda\Gamma, \sqrt{N}\}$ and go to Step2.

The above BBHT algorithm is used when the number of solutions t is unknown. It requires that $1 \le t \le \frac{3}{4}N$. If $t > \frac{3}{4}N$, we can apply the classical full search method to get part of the solutions efficiently, and then call the BBHT algorithm again. The case of no solution is also handled by the BBHT algorithm.

The BBHT algorithm has time complexity $O\left(\sqrt{N/t}\right)$, and the probability of finding a solution is bigger than $\frac{1}{2}$ (i.e., after repeating BBHT twice or more, a solution will be found with probability approaching 100%).

The BBHT algorithm is a combination of a quantum algorithm and classical iteration. The benefit of this combination is that many circuits are saved, while a pure quantum algorithm would cost an exponential number of circuits when the number of solutions is unknown. The BBHT algorithm will be used in this paper [4].
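The classical control loop of the BBHT algorithm can be sketched as below. This is a simulation under stated assumptions, not a quantum implementation: the measurement after j Grover iterations is replaced by a coin flip with the standard success probability $\sin^2((2j+1)\theta)$, where $\sin\theta = \sqrt{t/N}$. The names `grover_success_probability`, `simulate_bbht` and the `max_rounds` cutoff are hypothetical.

```python
import math
import random


def grover_success_probability(j, t, n):
    """Probability of measuring a solution after j Grover iterations."""
    theta = math.asin(math.sqrt(t / n))
    return math.sin((2 * j + 1) * theta) ** 2


def simulate_bbht(n, t, rng, lam=6 / 5, max_rounds=10_000):
    """Return (found, total_grover_iterations) for N items and t solutions."""
    gamma = 1.0  # Step1
    total = 0
    for _ in range(max_rounds):
        j = rng.randint(0, int(gamma))  # Step2: j in {0, ..., floor(gamma)}
        total += j  # Step3: j Grover iterations are spent
        # Step4 and Step5: the measurement is simulated by a biased coin.
        if rng.random() < grover_success_probability(j, t, n):
            return True, total
        gamma = min(lam * gamma, math.sqrt(n))  # Step6
    return False, total  # cutoff reached (assumed stand-in for a time-out)


found, iterations = simulate_bbht(1024, 4, random.Random(0))
print(found, iterations)
```

The cutoff `max_rounds` stands in for the time-out that handles the case of no solution; with t = 0 the simulated measurement never succeeds.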

II. THE QUANTUM ALGORITHM FOR INTERSECTION OPERATION 
A. Unitary Operation and Data Structure 

Without loss of generality, suppose that a set consists of many vectors (or records), and let $A = \{a_0, a_1, \ldots, a_{N-1}\}$, where $N = 2^n$ (otherwise, add special vectors such that $N = 2^n$). Similarly, we have the set $B = \{b_0, b_1, \ldots, b_{M-1}\}$, $M = 2^m$.

The match function between two vectors is defined as

$f_c(a_i, b_j) = \begin{cases} 1 & \text{if } a_i = b_j \\ 0 & \text{otherwise} \end{cases}$

The model of the intersection operation $C = A \cap B$ is to find two records $a_i$ and $b_j$ such that $a_i = b_j$ (i.e., $f_c(a_i, b_j) = 1$). We have the following data structures and unitary operations for this model.

DS1. Save set A in electronic memory as a database, where each vector $a_i$ is a record with unique index i. Similarly, save set B, where each vector $b_j$ has index j.
DS2. Construct five registers that have the format

$|i\rangle_{register1} |j\rangle_{register2} |a_i\rangle_{register3} |b_j\rangle_{register4} |f_c(a_i, b_j)\rangle_{register5}$

That is, the 1st, 2nd, 3rd, 4th and 5th registers are used respectively to save index i, index j, vector $a_i$, vector $b_j$ and the value of the match function $f_c$.
DS3. Initialize the five registers to zero:

$|0\rangle_{register1} |0\rangle_{register2} |0\rangle_{register3} |0\rangle_{register4} |0\rangle_{register5}$

DS4. Construct the Hadamard transform:

$H : |0\rangle|0\rangle|0\rangle|0\rangle|0\rangle \rightarrow \frac{1}{\sqrt{MN}} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|0\rangle|0\rangle|0\rangle$

where each ket denotes a register, not a single qubit.
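DS5. Construct the Quantum Loading Scheme:

$U_L : \frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|0\rangle|0\rangle|0\rangle \right) \rightarrow \frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|a_i\rangle|b_j\rangle|0\rangle \right)$

The function of $U_L$ is to load data from electronic memory into the entangled state according to the indices.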

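DS6. Design the oracle $O_c$ to compute the value of the match function $f_c$:

$O_c : \frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|a_i\rangle|b_j\rangle|0\rangle \right) \rightarrow \frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|a_i\rangle|b_j\rangle|f_c(a_i, b_j)\rangle \right)$

where

$f_c(a_i, b_j) = \begin{cases} 1 & \text{if } a_i = b_j \\ 0 & \text{otherwise} \end{cases}$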



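DS7. Design the oracle $O_f$:

$O_f : \frac{1}{\sqrt{MN}} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|a_i\rangle|b_j\rangle|f_c(a_i, b_j)\rangle_{register5} \rightarrow \frac{1}{\sqrt{MN}} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} (-1)^{f(register5)} |i\rangle|j\rangle|a_i\rangle|b_j\rangle|f_c(a_i, b_j)\rangle_{register5}$

The above oracle $O_f$ is the oracle in Grover's algorithm [3, 4], which flips the phase of the state.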

DS8. Construct the General Grover Iteration $G_{general}$:

$G_{general} = (2|\xi\rangle\langle\xi| - I) (U_L)^{\dagger} (O_c)^{\dagger} O_f O_c U_L$    (3)

The function of operation $G_{general}$ is equivalent to the Grover iteration (see Fig. 2).
If set C has a unique element (i.e., $|C| = 1$), applying $G_{general}$ to the state $\frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|0\rangle|0\rangle|0\rangle \right)$ for $O(\sqrt{MN})$ times will generate the intersection set. However, the case $|C| > 1$ often happens, where $|\cdot|$ denotes the size of a set. Therefore, we must design an improved algorithm to compute all elements of the set $C = A \cap B$.
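The effect of $G_{general}$ for $|C| = 1$ can be checked on a small classical state-vector sketch. The sketch tracks only the amplitudes of the index pairs (i, j), under the assumption that $U_L$, $O_c$, $O_f$ and the uncomputation together amount to a sign flip on the matching pairs, and that $2|\xi\rangle\langle\xi| - I$ is an inversion about the mean amplitude. The function names and the padded example sets (four vectors each, so that N and M are powers of two) are hypothetical.

```python
import math


def match(a, b):
    """Match function f_c: 1 if the two vectors are equal, 0 otherwise."""
    return 1 if a == b else 0


def general_grover_iteration(amps, set_a, set_b):
    """One G_general step on the amplitudes of the index pairs (i, j)."""
    m = len(set_b)
    # Load, compute f_c, flip the phase, uncompute: matching pairs change sign.
    flipped = [-amp if match(set_a[k // m], set_b[k % m]) else amp
               for k, amp in enumerate(amps)]
    # 2|xi><xi| - I acts as an inversion about the mean amplitude.
    mean = sum(flipped) / len(flipped)
    return [2 * mean - amp for amp in flipped]


def run_iterations(set_a, set_b, iterations):
    """Start from the uniform superposition and apply G_general repeatedly."""
    size = len(set_a) * len(set_b)
    amps = [1 / math.sqrt(size)] * size
    for _ in range(iterations):
        amps = general_grover_iteration(amps, set_a, set_b)
    return amps


A = [(1, 1, 1, 1), (2, 2, 2, 2), (1, 2, 3, 4), (5, 5, 5, 5)]
B = [(3, 3, 3, 3), (4, 4, 4, 4), (1, 2, 3, 4), (6, 6, 6, 6)]
amps = run_iterations(A, B, 3)
p_match = amps[2 * len(B) + 2] ** 2  # pair (i, j) = (2, 2) is the only match
print(round(p_match, 4))
```

With $MN = 16$ pairs and one matching pair, three iterations raise the probability of observing the matching indices from 1/16 to roughly 0.96.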



B. Subroutine 1: Find an Element in Set $C = A \cap B$
Step1. Initialize $\Gamma = 1$, $\lambda = 6/5$.

Step2. Choose k uniformly at random among the nonnegative integers not bigger than $\Gamma$.

Step3. Apply k iterations of the general Grover iteration $G_{general}$ starting from the state

$\frac{1}{\sqrt{MN}} \left( \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} |i\rangle|j\rangle|0\rangle|0\rangle|0\rangle \right) |ancilla\rangle$.

Step4. Observe the first and second registers: let $i_0$ and $j_0$ be the output.

Step5. If $a_{i_0} = b_{j_0}$, return the result $i_0$ and $j_0$, and exit.

Step6. Otherwise, set $\Gamma$ to $\min\{\lambda\Gamma, \sqrt{MN}\}$ and go to Step2.

Subroutine 1 is similar to the BBHT algorithm; the main difference between the two algorithms is that the Grover iteration is replaced by the general Grover iteration $G_{general}$.

Similar to the BBHT algorithm, we assume that $1 \le |C| \le \frac{3}{4}|A| \times |B|$ in this paper, where $|\cdot|$ denotes the size of a set. If $|C| > \frac{3}{4}|A| \times |B|$, we can apply the classical full search method to get part of the solutions efficiently, and then call subroutine 1 again. Similar to BBHT, the case $A \cap B = \emptyset$ (i.e., the empty set) is handled by subroutine 1.

Similar to the BBHT algorithm, subroutine 1 has time complexity $O\left(\sqrt{\frac{|A| \times |B|}{|C|}}\right)$.

Similar to the BBHT algorithm, the output of subroutine 1 is either a solution or not a solution. The probability that the output is a solution is bigger than $\frac{1}{2}$, and the probability that the output is not a solution is less than $\frac{1}{2}$. Repeating subroutine 1 twice therefore fails with probability less than $\frac{1}{4}$, and further repetitions make it nearly certain that a solution is obtained.

C. Quantum Search Algorithm for $C = A \cap B$ (Q_Intersection)

Step1. $C = \emptyset$ (i.e., the empty set) and nFlag = FALSE; save all elements of sets A and B in a database in electronic memory, where each element is a record.
Step2. while (nFlag = FALSE)
{

Step2-1. Call subroutine 1 to find a solution $a_{i_0} = b_{j_0}$;

Step2-2. If there is no output from subroutine 1, nFlag = TRUE;

If $a_{i_0} \notin C$, $C \leftarrow C \cup \{a_{i_0}\}$. Update the database such that the records of vectors $a_{i_0}$ and $b_{j_0}$ differ from all other records and from each other. Continue. (Notice: the case $a_{i_0} \in C$ will not happen in the next call of subroutine 1 because the database has been updated.)

}

Step3. Call subroutine 1 to find a solution again. If there is no output from subroutine 1, halt the algorithm. Otherwise, let nFlag = FALSE and go to Step2.
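The classical control flow of the steps above can be sketched as follows. This is an illustration under assumptions, not the quantum algorithm itself: subroutine 1 is replaced by a classical stand-in (`subroutine1_stub`, a hypothetical name) that returns a random matching index pair or `None`, and the database update overwrites the two found records with fresh `object()` placeholders, which compare unequal to every other record and to each other. Because the stub never fails when a solution exists, the confirming call of Step3 is not needed here.

```python
import random


def subroutine1_stub(db_a, db_b, rng):
    """Classical stand-in for subroutine 1: a matching (i0, j0) or None."""
    matches = [(i, j)
               for i, a in enumerate(db_a)
               for j, b in enumerate(db_b)
               if a == b]
    return rng.choice(matches) if matches else None


def q_intersection(set_a, set_b, rng):
    """Collect the intersection by repeatedly calling the stand-in subroutine."""
    db_a, db_b = list(set_a), list(set_b)  # Step1: database in classical memory
    result = []
    while True:  # Step2
        found = subroutine1_stub(db_a, db_b, rng)  # Step2-1
        if found is None:  # Step2-2: no output, so stop
            break
        i0, j0 = found
        if db_a[i0] not in result:
            result.append(db_a[i0])
        # Update the database: the two records no longer match anything.
        db_a[i0] = object()
        db_b[j0] = object()
    return result


A = [(1, 1, 1, 1), (2, 2, 2, 2), (1, 2, 3, 4), (7, 7, 7, 7)]
B = [(3, 3, 3, 3), (7, 7, 7, 7), (1, 2, 3, 4), (4, 4, 4, 4)]
C = q_intersection(A, B, random.Random(0))
print(sorted(C))
```

The loop calls the subroutine once per element of C plus one final call that returns nothing, which mirrors the role of classical memory in the algorithm.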



D. The Analysis of Time Complexity for Q_Intersection

Conclusion: Algorithm Q_Intersection has time complexity $O\left(\sqrt{|A| \times |B| \times |C|}\right)$, where $|\cdot|$ denotes the size of a set.

Proof: Similar to the BBHT algorithm, the case $|C| = 0$ is handled by this algorithm, and the running time is $O(\sqrt{MN})$. The following discussion is under the condition $|C| \ge 1$.

Before the first call of subroutine 1, there are $t = |C| = |A \cap B|$ unknown solutions. The sizes of sets A and B are both constant during the whole calculation. Thus, the scale of the problem is $t = |C|$. Suppose we need $I_t = I_{|C|}$ steps of computation to obtain all solutions. The first call of subroutine 1 costs $c\sqrt{\frac{MN}{t}}$ steps of computation, where c denotes a constant, $|A| = N$ and $|B| = M$. The output of subroutine 1 is either a solution or not a solution, so two cases can occur after executing Step2-1 (i.e., subroutine 1). In the first case the output is a solution, and in the second case it is not. The probability of the first case, $P_{case1}$, is bigger than $\frac{1}{2}$, while the probability of the second case, $P_{case2}$, is less than $\frac{1}{2}$. That is, there is a real number $\varepsilon$ ($0 < \varepsilon < \frac{1}{2}$) such that $P_{case1} = \frac{1}{2} + \varepsilon$ and $P_{case2} = \frac{1}{2} - \varepsilon$. When the first case happens, the scale of the problem becomes $t - 1 = |C| - 1$, and $I_{t-1} = I_{|C|-1}$ computation steps are needed for all remaining calculations. When the second case happens, the scale of the problem is still $|C|$, and $I_t = I_{|C|}$ computation steps are needed again.

The situation is the same for the second call of subroutine 1. When the scale of the problem (i.e., the number of unknown solutions) is t, calling subroutine 1 costs $c\sqrt{\frac{MN}{t}}$ steps of computation, and the number of remaining steps is $\left(\frac{1}{2} + \varepsilon\right) I_{t-1} + \left(\frac{1}{2} - \varepsilon\right) I_t$. That is,

$I_t = c\sqrt{\frac{MN}{t}} + \left(\frac{1}{2} + \varepsilon\right) I_{t-1} + \left(\frac{1}{2} - \varepsilon\right) I_t$

As the iterative computation proceeds, the scale of the problem t becomes smaller. When $t = 1$, subroutine 1 costs $O(\sqrt{MN})$ steps of computation, i.e., $I_1 = c_1\sqrt{MN}$, where $c_1$ is a constant.

The time complexity can be analyzed in the above way. Therefore, the following recursion equation is obtained to calculate the time complexity:

$I_t = c\sqrt{\frac{MN}{t}} + \left(\frac{1}{2} + \varepsilon\right) I_{t-1} + \left(\frac{1}{2} - \varepsilon\right) I_t$
$I_1 = c_1\sqrt{MN}$    (4)
$1 < t \le |C|, \; 0 < \varepsilon < \frac{1}{2}$

where $I_t$ denotes the number of computation steps when there are t unknown solutions.
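The recursion can also be evaluated numerically. The sketch below isolates $I_t$ on the left-hand side, which gives $\left(\frac{1}{2} + \varepsilon\right)(I_t - I_{t-1}) = c\sqrt{MN/t}$, and compares the result with the closed-form bound $4c\sqrt{|C|MN} + (c_1 - 4c)\sqrt{MN}$ that the proof arrives at. The constants `c`, `c1`, `eps` and the set sizes are arbitrary values assumed for illustration only.

```python
import math


def steps_for_all_solutions(m, n, c_size, c=1.0, c1=1.0, eps=0.25):
    """Evaluate I_{|C|} from the recursion, starting at I_1 = c1 * sqrt(MN)."""
    steps = c1 * math.sqrt(m * n)
    for t in range(2, c_size + 1):
        # (1/2 + eps) * (I_t - I_{t-1}) = c * sqrt(MN / t)
        steps += c * math.sqrt(m * n / t) / (0.5 + eps)
    return steps


def upper_bound(m, n, c_size, c=1.0, c1=1.0):
    """Bound 4c * sqrt(|C| MN) + (c1 - 4c) * sqrt(MN) from the proof."""
    return 4 * c * math.sqrt(c_size * m * n) + (c1 - 4 * c) * math.sqrt(m * n)


for size in (2, 16, 100):
    exact = steps_for_all_solutions(1024, 1024, size)
    bound = upper_bound(1024, 1024, size)
    print(size, round(exact), round(bound), exact < bound)
```

For every positive $\varepsilon$ the evaluated $I_{|C|}$ stays below the bound, because each increment is at most $2c\sqrt{MN/t}$ and the sum of $1/\sqrt{t}$ is dominated by the integral of $1/\sqrt{x}$.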

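By Eq. (4) we have

$I_t - I_{t-1} = 2c\sqrt{\frac{MN}{t}} - 2\varepsilon (I_t - I_{t-1})$    (5)

Performing $(I_{|C|} - I_{|C|-1}) + (I_{|C|-1} - I_{|C|-2}) + \cdots + (I_2 - I_1)$, we have

$I_{|C|} - I_1 = 2c\sqrt{MN} \left( \sum_{i=2}^{|C|} \frac{1}{\sqrt{i}} \right) - 2\varepsilon (I_{|C|} - I_1)$

Because $I_{|C|} > I_1$ and $\varepsilon > 0$, we have

$I_{|C|} - I_1 < 2c\sqrt{MN} \left( \sum_{i=2}^{|C|} \frac{1}{\sqrt{i}} \right)$

That is,

$I_{|C|} < 2c\sqrt{MN} \left( \sum_{i=2}^{|C|} \frac{1}{\sqrt{i}} \right) + I_1$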

We have

$I_{|C|} < 2c\sqrt{MN} \left( \int_{1}^{|C|} \frac{1}{\sqrt{x}} \, dx \right) + I_1$

$I_{|C|} < 4c\sqrt{|C| MN} + (c_1 - 4c)\sqrt{MN}$

i.e.,

$I_{|C|} < \left( 4c + \frac{c_1 - 4c}{\sqrt{|C|}} \right) \sqrt{|C| MN}$

Because $|C| \ge 1$, we have

$I_{|C|} < c_1\sqrt{|C| MN}$ if $c_1 > 4c$

$I_{|C|} < 4c\sqrt{|C| MN}$ if $c_1 \le 4c$

Thus, there is a constant $\lambda > 0$ such that

$I_{|C|} < \lambda\sqrt{|C| MN}$

That is,

$I_{|C|} = O\left(\sqrt{|A| \times |B| \times |C|}\right)$    (6)

Formula (6) shows that the Q_Intersection algorithm has time complexity $O\left(\sqrt{|A| \times |B| \times |C|}\right)$. That is, only $O\left(\sqrt{|A| \times |B| \times |C|}\right)$ steps of computation are needed to calculate the intersection set $C = A \cap B$, while $O(|A| \times |B|)$ steps are needed for classical computation. The quantum algorithm Q_Intersection is faster than the classical method by a factor of $O\left(\sqrt{\frac{|A| \times |B|}{|C|}}\right)$.
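As a rough numerical illustration of this comparison (the sizes are hypothetical, chosen only for easy arithmetic, and constant factors are ignored), take $|A| = |B| = 2^{10}$ and $|C| = 2^{4}$:

```latex
\begin{align*}
\text{classical: } & |A| \times |B| = 2^{10} \cdot 2^{10} = 2^{20} \approx 1.05 \times 10^{6} \\
\text{quantum: }   & \sqrt{|A| \times |B| \times |C|} = \sqrt{2^{24}} = 2^{12} = 4096 \\
\text{ratio: }     & \sqrt{\frac{|A| \times |B|}{|C|}} = \sqrt{2^{16}} = 2^{8} = 256
\end{align*}
```

The ratio shrinks as $|C|$ grows, so the advantage is largest when the intersection is small compared with $|A| \times |B|$.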

In addition, the probability that the BBHT algorithm finds a solution is bigger than $\frac{1}{2}$. Similar to the BBHT algorithm, the success probability of subroutine 1 is bigger than $\frac{1}{2}$, so repeated calls of subroutine 1 find a solution with probability approaching 100%. Step2 and Step3 of the Q_Intersection algorithm guarantee that the success probability is close to 100%.

Because $A \cup B = I - \bar{A} \cap \bar{B}$, where I denotes the universal set and $\bar{A}$ the complement of A, the presented algorithm can also be used to calculate the union operation. In sum, using the method of Q_Intersection to perform set operations is possible.
III. CONCLUSION

Set operations, such as intersection, union and complement, are fundamental calculations in mathematics. Set operations are the basis of many sciences and techniques, such as databases, signal processing and image processing. E.g., a database is based on set operations. Designing fast algorithms for set operations is therefore significant.

Full search is the common method for set operations in general because sorting multi-dimensional data is not very useful for improving running speed. Full search has time complexity $O(|A| \times |B|)$ for the intersection $C = A \cap B$, which is still very slow when the sets are huge. An electronic computer loads data into registers one by one from memory, which forms an efficiency bottleneck.

In this paper, a quantum search algorithm for the intersection operation on sets (named Q_Intersection) is presented, which is a combination of Grover's algorithm, classical memory and classical iteration. Using the method of Q_Intersection, quantum algorithms for other set operations can also be designed.

The advantages of Q_Intersection are listed below.

1. Q_Intersection has time complexity $O\left(\sqrt{|A| \times |B| \times |C|}\right)$, while the classical algorithm has time complexity $O(|A| \times |B|)$, where $|\cdot|$ denotes the size of a set. Q_Intersection is faster than the classical method by a factor of $O\left(\sqrt{\frac{|A| \times |B|}{|C|}}\right)$.

2. All information of the data can be loaded into a quantum state in $O(\log_2(|A||B|))$ steps of computation in Q_Intersection, and all data is loaded at the same time, while a classical computer can only load data one by one and needs $O(|A||B|)$ steps. Thus the efficiency bottleneck of the electronic computer is avoided.

3. In Step2-2 of Q_Intersection, the data in electronic memory is updated according to the output of the quantum algorithm, which simplifies the design of the quantum algorithm. As is well known, data in a superposition of states cannot be updated to a given number and cannot be measured while a unitary operation acts on the superposition. This limitation makes designing quantum algorithms very difficult. Q_Intersection shows that combining a quantum algorithm with classical memory is useful for decreasing the complexity of designing quantum algorithms.